Microvirga lupini LUT6T is an aerobic, non-motile, Gram-negative, non-spore-forming rod
that can exist as a soil saprophyte or as a legume microsymbiont of Lupinus texensis.
LUT6T was isolated in 2006 from a nodule recovered from the roots of the annual L.
texensis growing in Travis Co., Texas. LUT6T forms a highly specific nitrogen-fixing
symbiosis with endemic L. texensis, and no other Lupinus species can form an effective
nitrogen-fixing symbiosis with this isolate. Here we describe the features of M. lupini
LUT6T, together with genome sequence information and its annotation. The 9,633,614 bp
improved high-quality draft genome is arranged into 160 scaffolds of 1,366 contigs
containing 10,864 protein-coding genes and 87 RNA-only encoding genes, and is one of 20
rhizobial genomes sequenced as part of a DOE Joint Genome Institute 2010 Community
Sequencing Project.



Introduction 



Microvirga is one of the most recently discov- 
ered genera of Proteobacteria known to engage 
in symbiotic nitrogen fixation with legume 
plants, and joins a diverse set of at least twelve 
other lineages of Proteobacteria that share this 
ecological niche [1-4]. Several genera of legume 
root-nodule symbionts have a world-wide dis- 
tribution and interact with many legume taxa. 
By contrast, symbiotic strains of Microvirga are 
currently known from two distant locations and 
only two legume host genera [5,6]. The limited 
geographic and host distribution of Microvirga 
symbionts, along with the fact that root-nodule 
symbiosis is not characteristic of the genus 
Microvirga as a whole [7], suggests a relatively
recent evolutionary transition to legume symbi- 
osis in this group. 



M. lupini is a specialized nodule symbiont
associated with the legume Lupinus texensis, an
annual plant endemic to a relatively small
geographic area in central Texas and northeastern
Mexico [5]. The genus Lupinus has about 270
annual and perennial species concentrated in
western North America and in Andean regions
of South America, and a much smaller number of
species in the Mediterranean region of Europe
and northern Africa [8]. Basal lineages of
Lupinus all occur in the Mediterranean and are
associated with bacterial symbionts in the genus
Bradyrhizobium [9,10]. Bradyrhizobium is also
the main symbiont lineage for most Lupinus
species in North and South America, although a few
Lupinus species utilize nodule bacteria in the
genus Mesorhizobium [10-13]. Thus, the
acquisition of symbionts in the genus Microvirga by
plants of L. texensis appears to be an unusual,
derived condition for this legume genus.

L. texensis occurs in grassland and open shrub
communities with an annual precipitation of
50-100 cm, on diverse soil types [14]. L. texensis
appears to have a specialized symbiotic
relationship with M. lupini in that existing surveys have
failed to detect nodule symbionts of any other
bacterial genus associated with this plant [5].
Moreover, inoculation experiments with other 
North American species of Lupinus, as well as 
other legume genera, have so far failed to identi- 
fy any plant besides L. texensis that is capable of 
forming an effective, nitrogen-fixing symbiosis 
with M. lupini [5]. M. lupini strain Lut6T was
isolated from a nodule collected from a L. texensis
plant in Travis Co., Texas in 2006. Here we
provide an analysis of the improved high-quality
draft genome sequence of M. lupini strain Lut6T,
one of the three described symbiotic species of
Microvirga [15].

Classification and general features 

M. lupini LUT6T is a non-motile, Gram-negative
rod in the order Rhizobiales of the class
Alphaproteobacteria. The rod-shaped form varies in
size, with dimensions of 1.0 μm for width and
1.5-2.0 μm for length (Figure 1, Left and Center).

It is fast growing, forming colonies within 3-4
days when grown on half strength Lupin Agar
(½LA) [16], tryptone-yeast extract agar (TY)
[17] or a modified yeast-mannitol agar (YMA)
[18] at 28°C. Colonies on ½LA are white-opaque,
slightly domed and moderately mucoid with
smooth margins (Figure 1, Right).

Minimum Information about the Genome
Sequence (MIGS) is provided in Table 1. Figure 2
shows the phylogenetic neighborhood of M.
lupini LUT6T in a 16S rRNA sequence based tree.
This strain shares 100% (1,358/1,358 bases)
and 98% (1,344/1,367 bases) sequence identity
to the 16S rRNA of Microvirga sp. Lut5 and
Microvirga lotononidis WSM3557T, respectively.
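
The identity percentages quoted above follow directly from the matched and aligned base counts. A minimal sketch of that arithmetic is given below; the function name is illustrative only and is not part of any tool used in this study.

```python
def percent_identity(matches: int, aligned_length: int) -> float:
    """Return pairwise sequence identity as a percentage."""
    if aligned_length <= 0:
        raise ValueError("aligned_length must be positive")
    if not 0 <= matches <= aligned_length:
        raise ValueError("matches must lie between 0 and aligned_length")
    return 100.0 * matches / aligned_length


# Base counts as quoted in the text (matched bases, aligned bases).
comparisons = {
    "Microvirga sp. Lut5": (1358, 1358),
    "Microvirga lotononidis WSM3557T": (1344, 1367),
}

for strain, (matches, length) in comparisons.items():
    print(f"{strain}: {percent_identity(matches, length):.1f}% identity")
```

The second comparison evaluates to roughly 98.3%, which the text reports rounded to 98%.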



Table 1. Classification and general features of M. lupini LUT6T according to the MIGS recommendations [19,20]

MIGS ID    Property                 Term                               Evidence code
           Current classification   Domain Bacteria                    TAS [20]
                                    Phylum Proteobacteria              TAS [21]
                                    Class Alphaproteobacteria          TAS [22,23]
                                    Order Rhizobiales
                                    Family Methylobacteriaceae         TAS [23,25]
                                    Genus Microvirga                   TAS [15,26-28]
                                    Species Microvirga lupini          TAS [15]
                                    Strain LUT6T
           Gram stain               Negative                           TAS [15]
           Cell shape               Rod                                TAS [15]
           Motility                 Non-motile                         IDA
           Sporulation              Non-sporulating                    TAS [15]
           Temperature range        Mesophile                          TAS [15]
           Optimum temperature      39°C                               TAS [15]
           Salinity                 Non-halophile                      TAS [15]
MIGS-22    Oxygen requirement       Aerobic                            TAS [15]
           Carbon source            Varied                             TAS [15]
           Energy source            Chemoorganotroph                   TAS [15]
MIGS-6     Habitat                  Soil, root nodule, on host         TAS [15]
MIGS-15    Biotic relationship      Free living, symbiotic             TAS [15]
MIGS-14    Pathogenicity            Non-pathogenic                     NAS
           Biosafety level          1                                  TAS [29]
           Isolation                Root nodule of Lupinus texensis    TAS [5]
MIGS-4     Geographic location      Travis Co., Texas                  TAS [5]
MIGS-5     Soil collection date     03 Jan 2006                        IDA
MIGS-4.1   Latitude                 30.459                             IDA
MIGS-4.2   Longitude                -97.838                            IDA
MIGS-4.3   Depth                    0-10 cm                            IDA
MIGS-4.4   Altitude                 270 m                              IDA


Evidence codes - IDA: Inferred from Direct Assay; TAS: Traceable Author Statement (i.e., a direct report exists in 
the literature); NAS: Non-traceable Author Statement (i.e., not directly observed for the living, isolated sample, but 
based on a generally accepted property for the species, or anecdotal evidence). These evidence codes are from the 
Gene Ontology project [30]. 






Standards in Genomic Sciences 



Reeve et al. 





Figure 1. Images of M. lupini LUT6T using scanning (Left) and transmission (Center) electron
microscopy and the appearance of colony morphology on solid medium (Right).

Figure 2. Phylogenetic tree showing the relationship of M. lupini LUT6T (shown in bold print) to other root nodule
bacteria in the order Rhizobiales based on aligned sequences of the 16S rRNA gene (1,320 bp internal region). All
sites were informative and there were no gap-containing sites. Phylogenetic analyses were performed using MEGA,
version 5 [31]. The tree was built using the Maximum-Likelihood method with the General Time Reversible model
[32]. Bootstrap analysis [33] with 500 replicates was performed to assess the support of the clusters. Type strains
are indicated with a superscript T. Brackets after the strain name contain a DNA database accession number
and/or a GOLD ID (beginning with the prefix G) for a sequencing project registered in GOLD [34]. Published
genomes are indicated with an asterisk.

Symbiotaxonomy

M. lupini strain LUT6T was isolated in 2006 from a
nodule collected from Lupinus texensis growing
in Travis Co., Texas. The symbiotic characteristics
of this isolate on a range of selected hosts
are provided in Table 2.



Table 2. Symbiotic characteristics of M. lupini LUT6T on a range of selected hosts†

Legume species                Nodulation   N2 fixation   Comment
Lupinus texensis              Nod+         Fix+          Highly effective
Lupinus perennis              Nod-         Fix-          No nodulation
Lupinus succulentus           Nod-         Fix-          No nodulation
Lupinus microcarpus           Nod-         Fix-          No nodulation
Phaseolus vulgaris            Nod-         Fix-          No nodulation
Macroptilium atropurpureum    Nod+         Fix-          No fixation
Desmodium canadense           Nod-         Fix-          No nodulation
Cytisus scoparius             Nod+         Fix-          No fixation
Mimosa pudica                 Nod-         Fix-          No nodulation

†Data compiled from [5]. Note that '+' and '-' denote presence or absence, respectively,
of nodulation (Nod) or N2 fixation (Fix).



Genome sequencing and annotation 
Genome project history 

This organism was selected for sequencing on
the basis of its environmental and agricultural
relevance to issues in global carbon cycling,
alternative energy production, and biogeochemical
importance, and is part of the Community
Sequencing Program at the U.S. Department of
Energy, Joint Genome Institute (JGI) for projects
of relevance to agency missions. The genome
project is deposited in the Genomes OnLine
Database [34] and an improved-high-quality-draft
genome sequence in IMG. Sequencing, finishing
and annotation were performed by the JGI. A
summary of the project information is shown in
Table 3.



Table 3. Genome sequencing project information for M. lupini LUT6T.

MIGS ID     Property                Term
MIGS-31     Finishing quality       Improved high-quality draft
MIGS-28     Libraries used          Illumina GAii shotgun and paired end 454 libraries
MIGS-29     Sequencing platforms    Illumina GAii and 454 GS FLX Titanium technologies
MIGS-31.2   Sequencing coverage     3.5x 454 paired end, 300x Illumina
MIGS-30     Assemblers              Velvet version 1.0.13; Newbler 2.3, phrap SPS-4.24
MIGS-32     Gene calling methods    Prodigal 1.4
            GOLD ID                 Gi06478
            NCBI project ID         66529
            Database: IMG           2508501050
            Project relevance       Symbiotic N2 fixation, agriculture

Growth conditions and DNA isolation

M. lupini LUT6T was cultured to mid-logarithmic
phase in 60 ml of TY rich medium [35] on a
gyratory shaker at 28°C. DNA was isolated from the
cells using a CTAB (cetyl trimethyl ammonium
bromide) bacterial genomic DNA isolation
method [36].

Genome sequencing and assembly 

The genome of M. lupini LUT6T was sequenced at
the Joint Genome Institute (JGI) using a
combination of Illumina [37] and 454 technologies
[38]. Two libraries were constructed for this
genome [36]: an Illumina GAii shotgun library,
which generated 77,090,752 reads totaling 5,858.9
Mbp, and a paired end 454 library with an
average insert size of 8 Kbp, which generated
238,026 reads totaling 81.4 Mb of 454 data.
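
As a quick consistency check on the library statistics above, the mean read length of each library can be estimated by dividing total yield by read count. The sketch below assumes that "Mbp" and "Mb" both denote 10^6 bases; the helper name is hypothetical.

```python
def mean_read_length(total_bases: float, read_count: int) -> float:
    """Estimate mean read length (bases) from total yield and read count."""
    if read_count <= 0:
        raise ValueError("read_count must be positive")
    return total_bases / read_count


MEGABASE = 1e6  # assumption: 1 Mbp (or Mb) = 10^6 bases

# Yields and read counts as reported in the text.
illumina_mean = mean_read_length(5_858.9 * MEGABASE, 77_090_752)
ps454_mean = mean_read_length(81.4 * MEGABASE, 238_026)

print(f"Illumina GAii mean read length: {illumina_mean:.1f} bases")
print(f"454 paired end mean read length: {ps454_mean:.1f} bases")
```

Under that assumption the Illumina reads average about 76 bases and the 454 reads about 342 bases.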

All general aspects of library construction and 
sequencing performed at the JGI can be found at 
[36]. The initial draft assembly contained 1,719 
contigs in 6 scaffolds. The 454 paired end data 
were assembled together with Newbler, version 
2.3-PreRelease-6/30/2009. The Newbler consensus
sequences were computationally shredded
into 2 Kbp overlapping fake reads (shreds).
Illumina sequencing data was assembled with 
VELVET, version 1.0.13 [39], and the consensus 
sequence computationally shredded into 1.5 
Kbp overlapping fake reads (shreds). The 454 
Newbler consensus shreds, the Illumina VELVET 
consensus shreds and the read pairs in the 454 
paired end library were integrated using parallel 
phrap, version SPS - 4.24 (High Performance 
Software, LLC). The software Consed [40-42] 
was used in the following finishing process. 
Illumina data was used to correct potential base 
errors and increase consensus quality using the 
software Polisher developed at JGI [43]. Possible 
mis-assemblies were corrected using
gapResolution (Cliff Han, unpublished) or
Dupfinisher [44]. Some gaps between contigs
were closed by editing in Consed. The estimated 
genome size is 10.3 Mb and the final assembly is 
based on 36.2 Mb of 454 draft data which pro- 
vides an average 3.5x coverage of the genome 
and 3,090 Mbp of Illumina draft data which pro- 
vides an average 300x coverage of the genome. 
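
The coverage figures quoted above are consistent with dividing the amount of draft data by the estimated genome size. The following is a minimal sketch of that calculation under that assumption; the function name is illustrative and does not correspond to any JGI tool.

```python
def fold_coverage(sequenced_mb: float, genome_size_mb: float) -> float:
    """Average fold coverage = sequenced bases / genome size (same units)."""
    if genome_size_mb <= 0:
        raise ValueError("genome_size_mb must be positive")
    return sequenced_mb / genome_size_mb


ESTIMATED_GENOME_MB = 10.3  # estimated genome size reported in the text

cov_454 = fold_coverage(36.2, ESTIMATED_GENOME_MB)         # 454 draft data (Mb)
cov_illumina = fold_coverage(3090.0, ESTIMATED_GENOME_MB)  # Illumina draft data (Mbp)

print(f"454 coverage: {cov_454:.1f}x")
print(f"Illumina coverage: {cov_illumina:.1f}x")
```

Both values round to the reported 3.5x and 300x.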

Genome annotation 

Genes were identified using Prodigal [45] as part 
of the DOE-JGI annotation pipeline [46]. The 
predicted CDSs were translated and used to 
search the National Center for Biotechnology In- 
formation (NCBI) nonredundant database, 
UniProt, TIGRFam, Pfam, PRIAM, KEGG, COG, 
and InterPro databases. The tRNAscan-SE tool
[47] was used to find tRNA genes, whereas ribo- 
somal RNA genes were found by searches 
against models of the ribosomal RNA genes built 
from SILVA [48]. Other non-coding RNAs such 
as the RNA components of the protein secretion 
complex and the RNase P were identified by 
searching the genome for the corresponding 
Rfam profiles using INFERNAL [49]. Additional 
gene prediction analysis and manual functional 
annotation was performed within the Integrated 
Microbial Genomes (IMG-ER) platform [50]. 

Genome properties 

The genome is 9,633,614 nucleotides long with
60.26% GC content (Table 4) and comprises
160 scaffolds (Figure 3) of 1,366 contigs. From a
total of 10,951 genes, 10,864 were protein-encoding
genes and 87 were RNA-only encoding genes. The
majority of genes (63.25%) were assigned a
putative function, whilst the remaining genes were
annotated as hypothetical. The distribution of
genes into COG functional categories is
presented in Table 5.



Table 4. Genome statistics for Microvirga lupini LUT6T

Attribute                           Value        % of Total
Genome size (bp)                    9,633,614    100.00
DNA coding region (bp)              7,880,506    81.80
DNA G+C content (bp)                5,805,078    60.26
Number of scaffolds                 160
Number of contigs                   1,366
Total genes                         10,951       100.00
RNA genes                           87           0.79
rRNA operons                        1            0.01
Protein-coding genes                10,864       99.21
Genes with function prediction      6,927        63.25
Genes assigned to COGs              6,990        63.83
Genes assigned Pfam domains         7,343        67.05
Genes with signal peptides          768          7.01
Genes with transmembrane helices    2,006        18.32
CRISPR repeats                      0
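
The "% of Total" column in Table 4 can be reproduced from the raw counts. The sketch below assumes that base-pair attributes are expressed relative to genome size and gene attributes relative to total genes; variable names are illustrative.

```python
GENOME_SIZE_BP = 9_633_614
TOTAL_GENES = 10_951


def pct(part: int, whole: int) -> float:
    """Percentage of `whole` represented by `part`, rounded to 2 decimals."""
    return round(100.0 * part / whole, 2)


# Attributes measured in base pairs (denominator: genome size).
bp_attributes = {
    "DNA coding region": 7_880_506,
    "DNA G+C content": 5_805_078,
}

# Attributes measured in genes (denominator: total genes).
gene_attributes = {
    "RNA genes": 87,
    "Protein-coding genes": 10_864,
    "Genes with function prediction": 6_927,
    "Genes assigned to COGs": 6_990,
    "Genes assigned Pfam domains": 7_343,
    "Genes with signal peptides": 768,
    "Genes with transmembrane helices": 2_006,
}

for name, value in bp_attributes.items():
    print(f"{name}: {pct(value, GENOME_SIZE_BP):.2f}%")
for name, value in gene_attributes.items():
    print(f"{name}: {pct(value, TOTAL_GENES):.2f}%")
```

With these denominators the computed values agree with the percentages listed in the table.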

[Figure 3 image; scaffold labels: Microvirga Lut6_MicLut6DRAFT_MLH.129, Microvirga Lut6_MicLut6DRAFT_MLH.4, Microvirga Lut6_MicLut6DRAFT_MLH.57, Microvirga Lut6_MicLut6DRAFT_MLH.51]

Figure 3. Graphical map of the genome of Microvirga lupini LUT6T showing the
four largest scaffolds. From bottom to the top of each scaffold: Genes on forward
strand (color by COG categories as denoted by the IMG platform), Genes on
reverse strand (color by COG categories), RNA genes (tRNAs green, sRNAs red,
other RNAs black), GC content, GC skew.




Table 5. Number of protein coding genes of Microvirga lupini LUT6T associated with the general COG
functional categories.

Code   Value    %age     COG Category
J      209      2.72     Translation, ribosomal structure and biogenesis
A      1        0.01     RNA processing and modification
K      571      7.43     Transcription
L      667      8.68     Replication, recombination and repair
B      10       0.13     Chromatin structure and dynamics
D      53       0.69     Cell cycle control, mitosis and meiosis
Y                        Nuclear structure
V      104      1.35     Defense mechanisms
T      463      6.02     Signal transduction mechanisms
M      316      4.11     Cell wall/membrane biogenesis
N      69       0.90     Cell motility
Z      0        0.00     Cytoskeleton
W      1        0.01     Extracellular structures
U      95       1.24     Intracellular trafficking and secretion
O      249      3.24     Posttranslational modification, protein turnover, chaperones
C      401      5.22     Energy production and conversion
G      602      7.83     Carbohydrate transport and metabolism
E      828      10.77    Amino acid transport and metabolism
F      100      1.30     Nucleotide transport and metabolism
H      263      3.42     Coenzyme transport and metabolism
I      266      3.46     Lipid transport and metabolism
P      388      5.05     Inorganic ion transport and metabolism
Q      263      3.42     Secondary metabolite biosynthesis, transport and catabolism
R      976      12.70    General function prediction only
S      790      10.28    Function unknown
-      3,961    36.17    Not in COGs
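
The percentages in Table 5 appear to be computed against the total number of COG category assignments (the sum of the per-category counts) rather than against the total gene count, whereas the "Not in COGs" row appears to be relative to all 10,951 genes. This is an inference from the numbers, not a statement in the source. A short sketch that checks it:

```python
# Per-category gene counts from Table 5 (category Y has no value listed).
COG_COUNTS = {
    "J": 209, "A": 1, "K": 571, "L": 667, "B": 10, "D": 53,
    "V": 104, "T": 463, "M": 316, "N": 69, "Z": 0, "W": 1,
    "U": 95, "O": 249, "C": 401, "G": 602, "E": 828, "F": 100,
    "H": 263, "I": 266, "P": 388, "Q": 263, "R": 976, "S": 790,
}
TOTAL_GENES = 10_951
NOT_IN_COGS = 3_961

# Assumed denominator for the category rows: all COG assignments.
TOTAL_ASSIGNMENTS = sum(COG_COUNTS.values())


def cog_percent(code: str) -> float:
    """Share of all COG assignments falling in one category (2 decimals)."""
    return round(100.0 * COG_COUNTS[code] / TOTAL_ASSIGNMENTS, 2)


not_in_cogs_percent = round(100.0 * NOT_IN_COGS / TOTAL_GENES, 2)

for code in COG_COUNTS:
    print(f"{code}: {cog_percent(code):.2f}%")
print(f"Not in COGs: {not_in_cogs_percent:.2f}% of all genes")
```

The assignment total exceeds the 6,990 genes assigned to COGs in Table 4, presumably because a single gene can be assigned to more than one functional category.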